Generative model

In statistical classification, two main approaches are called the generative approach and the discriminative approach. These compute classifiers by different means, differing in the degree of statistical modelling. Terminology is inconsistent,[a] but three major types can be distinguished, following Jebara (2004):

  1. A generative model is a statistical model of the joint probability distribution P(X, Y) on a given observable variable X and target variable Y;[1] a generative model can be used to "generate" random instances (outcomes) of an observation x (see the sketch below).[2]
  2. A discriminative model is a model of the conditional probability P(Y | X = x) of the target Y, given an observation x. It can be used to "discriminate" the value of the target variable Y, given an observation x.[3]
  3. Classifiers computed without using a probability model are also referred to loosely as "discriminative".

The distinction between these last two classes is not consistently made;[4] Jebara (2004) refers to these three classes as generative learning, conditional learning, and discriminative learning, but Ng & Jordan (2002) only distinguish two classes, calling them generative classifiers (joint distribution) and discriminative classifiers (conditional distribution or no distribution), not distinguishing between the latter two classes.[5] Analogously, a classifier based on a generative model is a generative classifier, while a classifier based on a discriminative model is a discriminative classifier, though this term also refers to classifiers that are not based on a model.
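
As a concrete illustration of the first two model types, here is a minimal sketch in Python with NumPy; the joint table is invented for illustration. It tabulates a discrete joint distribution P(X, Y), derives the conditional P(Y | X = x) by Bayes' rule, and draws random (x, y) outcomes from the joint, which is the sense in which a generative model can "generate" observations.

    import numpy as np

    # Hypothetical joint distribution P(X, Y) over a binary observation X (rows)
    # and a binary label Y (columns); the entries are invented and sum to 1.
    joint = np.array([[0.30, 0.10],   # P(X=0, Y=0), P(X=0, Y=1)
                      [0.15, 0.45]])  # P(X=1, Y=0), P(X=1, Y=1)

    # Discriminative quantity derived from the joint by Bayes' rule:
    # P(Y | X = x) = P(X = x, Y) / P(X = x)
    conditional = joint / joint.sum(axis=1, keepdims=True)
    print(conditional)  # row x holds the distribution over labels given X = x

    # "Generating" instances: draw random (x, y) outcomes from the joint.
    rng = np.random.default_rng(0)
    flat = rng.choice(joint.size, size=5, p=joint.ravel())
    print([np.unravel_index(i, joint.shape) for i in flat])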

Standard examples of each, all of which are linear classifiers, are:

  1. generative model: naive Bayes classifier (or linear discriminant analysis)
  2. discriminative model: logistic regression
  3. distribution-free classifier: perceptron (or support vector machine)
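
A minimal sketch of these standard examples, assuming Python with scikit-learn and a small synthetic dataset (the data and settings are invented for illustration, not drawn from the sources cited here):

    import numpy as np
    from sklearn.naive_bayes import GaussianNB             # generative
    from sklearn.linear_model import LogisticRegression    # discriminative
    from sklearn.linear_model import Perceptron            # distribution-free

    # Synthetic two-class data: two Gaussian blobs (invented for illustration).
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0.0, 1.0, (100, 2)),
                   rng.normal(3.0, 1.0, (100, 2))])
    y = np.array([0] * 100 + [1] * 100)

    # GaussianNB fits a joint model (class priors and class-conditional
    # densities); LogisticRegression fits P(Y | X) directly; Perceptron
    # learns a decision rule with no probability model at all.
    for clf in (GaussianNB(), LogisticRegression(), Perceptron()):
        clf.fit(X, y)
        print(type(clf).__name__, clf.score(X, y))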

In application to classification, one wishes to go from an observation x to a label y (or a probability distribution on labels). One can compute this directly, without using a probability distribution (distribution-free classifier); one can estimate the conditional probability P(Y | X = x) of a label given an observation (discriminative model), and base classification on that; or one can estimate the joint distribution P(X, Y) (generative model), from that compute the conditional probability P(Y | X = x), and then base classification on that. These are increasingly indirect, but increasingly probabilistic, allowing more domain knowledge and probability theory to be applied. In practice different approaches are used, depending on the particular problem, and hybrids can combine strengths of multiple approaches.
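
Concretely, the conditional probability used in the generative route follows from the joint distribution by Bayes' rule:

    P(Y = y | X = x) = P(X = x, Y = y) / P(X = x) = P(X = x, Y = y) / Σ_y′ P(X = x, Y = y′)

Classification then picks the most likely label, argmax_y P(Y = y | X = x); since the denominator does not depend on y, this is the same as argmax_y P(X = x, Y = y).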



  1. ^ Ng & Jordan (2002): "Generative classifiers learn a model of the joint probability, p(x, y), of the inputs x and the label y, and make their predictions by using Bayes rules to calculate p(y | x), and then picking the most likely label y."
  2. ^ Mitchell, Tom M. (2015). "Generative and Discriminative Classifiers: Naive Bayes and Logistic Regression". Machine Learning (draft chapter).
  3. ^ Mitchell, Tom M. (2015). "Generative and Discriminative Classifiers: Naive Bayes and Logistic Regression". Machine Learning (draft chapter).
  4. ^ Jebara 2004, 2.4 Discriminative Learning: "This distinction between conditional learning and discriminative learning is not currently a well-established convention in the field."
  5. ^ Ng & Jordan 2002: "Discriminative classifiers model the posterior p(y | x) directly, or learn a direct map from inputs x to the class labels."
